55 research outputs found

    Toward a self-organizing pre-symbolic neural model representing sensorimotor primitives

    Get PDF
    Copyright © 2014 Zhong, Cangelosi and Wermter. Open-access article under the Creative Commons Attribution License (CC BY).
    The acquisition of symbolic and linguistic representations of sensorimotor behavior is a cognitive process performed by an agent while executing and/or observing its own and others' actions. According to Piaget's theory of cognitive development, these representations develop during the sensorimotor and pre-operational stages. We propose a model that relates the conceptualization of higher-level information from visual stimuli to the development of the ventral/dorsal visual streams. The model employs a neural network architecture incorporating a predictive sensory module based on an RNNPB (Recurrent Neural Network with Parametric Biases) and a horizontal product model. We exemplify the model with a robot that passively observes an object to learn its features and movements. During the learning process of observing sensorimotor primitives, i.e., observing a set of arm-movement trajectories and the object features they are oriented toward, a pre-symbolic representation self-organizes in the parametric units. These representational units act as bifurcation parameters, guiding the robot to recognize and predict the various learned sensorimotor primitives. The pre-symbolic representation also accounts for the learning of sensorimotor primitives in a latent learning context.
    Peer reviewed. Final published version.
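    The parametric-bias mechanism described in the abstract can be sketched as an Elman-style recurrent step whose input is augmented with a small, per-sequence PB vector; the layer sizes and names below are illustrative assumptions, not the paper's actual architecture (a minimal numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4-D sensory input, 8 hidden units, 2 parametric-bias units.
n_in, n_hid, n_pb = 4, 8, 2
W_in = rng.standard_normal((n_hid, n_in + n_pb)) * 0.1
W_rec = rng.standard_normal((n_hid, n_hid)) * 0.1
W_out = rng.standard_normal((n_in, n_hid)) * 0.1

def rnnpb_step(x, h, pb):
    """One step of an Elman RNN whose input is augmented with the PB vector.
    During training the PB values (not shown) are updated by backpropagated
    error, so each observed primitive self-organizes its own point in PB space."""
    h_new = np.tanh(W_in @ np.concatenate([x, pb]) + W_rec @ h)
    x_pred = W_out @ h_new          # prediction of the next sensory input
    return x_pred, h_new

pb = np.zeros(n_pb)                 # one PB vector per sensorimotor primitive
h = np.zeros(n_hid)
x = rng.standard_normal(n_in)
x_pred, h = rnnpb_step(x, h, pb)
```

    At recognition time the weights are frozen and only the PB vector is optimized to minimize prediction error, which is what lets the PB units act as bifurcation parameters selecting among learned primitives.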

    Toward Abstraction from Multi-modal Data: Empirical Studies on Multiple Time-scale Recurrent Models

    Full text link
    The abstraction task is challenging for multi-modal sequences, as it requires a deeper semantic understanding of the data and novel text generation. Although recurrent neural networks (RNNs) can model the context of time sequences, in most cases the long-term dependencies of multi-modal data cause the gradients of back-propagation-through-time training to vanish in the time domain. Recently, inspired by the Multiple Time-scale Recurrent Neural Network (MTRNN), an extension of the Gated Recurrent Unit (GRU), called the Multiple Time-scale Gated Recurrent Unit (MTGRU), has been proposed to learn long-term dependencies in natural language processing. In particular, it can also accomplish the abstraction task for paragraphs, provided that the time constants are well defined. In this paper, we compare the MTRNN and the MTGRU in terms of their learning performance as well as their abstraction representations at the higher level (with slower neural activation). This was done through two studies based on a smaller data-set (two-dimensional time sequences from non-linear functions) and a relatively large data-set (43-dimensional time sequences from iCub manipulation tasks with multi-modal data). We conclude that gated recurrent mechanisms may be necessary for learning long-term dependencies in large-dimension multi-modal data-sets (e.g., learning of robot manipulation), even when natural language commands are not involved. For smaller learning tasks with simple time sequences, however, generic recurrent models such as the MTRNN are sufficient to accomplish the abstraction task.
    Comment: Accepted by IJCNN 201
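    The multiple-timescale idea behind MTRNN can be sketched as leaky-integrator units whose time constant τ controls how quickly the internal state follows new input; a large τ yields the slow "abstract" dynamics mentioned above. The time constants below are illustrative, not the papers' values:

```python
def leaky_update(u, z, tau):
    """MTRNN-style leaky integrator: u_t = (1 - 1/tau) * u_{t-1} + (1/tau) * z_t.
    tau = 1 recovers a plain RNN update; large tau yields slow, abstract units."""
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * z

# Drive a fast (tau = 2) and a slow (tau = 70) unit with the same step input.
u_fast, u_slow = 0.0, 0.0
for _ in range(50):
    u_fast = leaky_update(u_fast, 1.0, tau=2.0)
    u_slow = leaky_update(u_slow, 1.0, tau=70.0)

# After 50 steps the fast unit has nearly converged to the input,
# while the slow unit still lags, integrating over a longer horizon.
```

    Stacking layers with increasing τ is what gives the higher levels their slower activation and, with it, the capacity to represent longer-term structure.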

    Predictive Coding Based Multiscale Network with Encoder-Decoder LSTM for Video Prediction

    Full text link
    We present a multi-scale predictive coding model for future video frame prediction. Drawing inspiration from the "Predictive Coding" theories in cognitive science, it is updated by a combination of bottom-up and top-down information flows, which enhances the interaction between different network levels. However, traditional predictive coding models only predict what is happening hierarchically rather than predicting the future. To address this problem, our model employs a multi-scale approach (coarse to fine), where the higher-level neurons generate coarser predictions (lower resolution), while the lower-level neurons generate finer predictions (higher resolution). In terms of network architecture, we directly incorporate the encoder-decoder network within the LSTM module and share the final encoded high-level semantic information across different network levels. This enables more comprehensive interaction between the current input and the historical states of the LSTM than the traditional Encoder-LSTM-Decoder architecture, and thus learns more plausible temporal and spatial dependencies. Furthermore, to tackle the instability of adversarial training and mitigate the accumulation of prediction errors in long-term prediction, we propose several improvements to the training strategy. Our approach achieves good performance on datasets such as KTH, Moving MNIST and Caltech Pedestrian. Code is available at https://github.com/Ling-CF/MSPN
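    The coarse-to-fine scheme can be sketched independently of the LSTM details: each level predicts at its own resolution, and the upsampled coarse prediction conditions the next finer level. In this hypothetical numpy sketch, `refine` is a stand-in for the learned per-level encoder-decoder LSTM, not the paper's module:

```python
import numpy as np

def downsample(x):
    """Halve resolution by 2x2 average pooling."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double resolution by nearest-neighbour repetition."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def refine(frame, coarse_pred):
    """Placeholder for the learned per-level module: here it simply blends
    the bottom-up frame with the top-down (upsampled) coarse prediction."""
    return 0.5 * frame + 0.5 * coarse_pred

def coarse_to_fine_predict(frame, n_levels=3):
    # Bottom-up pass: build a resolution pyramid of the current frame.
    pyramid = [frame]
    for _ in range(n_levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    # Top-down pass: start from the coarsest prediction, refine level by level.
    pred = pyramid[-1]
    for level in reversed(range(n_levels - 1)):
        pred = refine(pyramid[level], upsample(pred))
    return pred

frame = np.random.default_rng(1).random((16, 16))
pred = coarse_to_fine_predict(frame)
```

    The point of the structure is that errors made at low resolution are cheap to correct before they are propagated to the full-resolution output.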

    Encoding Multiple Sensor Data for Robotic Learning Skills from Multimodal Demonstration

    Get PDF
    © 2013 IEEE. Learning a task such as pushing something, where the constraints of both position and force have to be satisfied, is usually difficult for a collaborative robot. In this work, we propose a multimodal teaching-by-demonstration system that enables the robot to perform such tasks. The basic idea is to transfer the adaptation of multi-modal information from a human tutor to the robot by taking into account multiple sensor signals (i.e., motion trajectories, stiffness, and force profiles). The human tutor's stiffness is estimated from the limb surface electromyography (EMG) signals obtained during the demonstration phase. The force profiles in Cartesian space are collected from a force/torque sensor mounted between the robot endpoint and the tool. Subsequently, a hidden semi-Markov model (HSMM) is used to encode the multiple signals in a unified manner. The correlations between position and the other three control variables (i.e., velocity, stiffness and force) are encoded with separate HSMM models. Based on the estimated HSMM parameters, Gaussian mixture regression (GMR) is then utilized to generate the expected control variables. The learned variables are further mapped into an impedance controller in the joint space through inverse kinematics for the reproduction of the task. Comparative tests have been conducted to verify the effectiveness of our approach on a Baxter robot
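    The GMR step described above conditions a joint Gaussian mixture over (input, output) on the observed input: each component contributes the conditional mean mu_o + S_oi / S_ii * (x - mu_i), weighted by its responsibility for x. The sketch below uses a hypothetical 1-D position input and 1-D stiffness output with hand-picked parameters, as if extracted from a trained HSMM:

```python
import numpy as np

# Hypothetical 2-component joint model over (position, stiffness);
# the parameters are illustrative, not learned from demonstrations.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 10.0],        # [mu_position, mu_stiffness]
                  [1.0, 30.0]])
covs = np.array([[[0.10, 0.05],
                  [0.05, 4.00]],
                 [[0.10, -0.05],
                  [-0.05, 4.00]]])

def gmr(x):
    """Gaussian mixture regression: E[stiffness | position = x]."""
    cond_means, resp = [], []
    for w, mu, S in zip(weights, means, covs):
        # Conditional mean of this component given the observed position.
        cond_means.append(mu[1] + S[1, 0] / S[0, 0] * (x - mu[0]))
        # Responsibility of this component for x (1-D Gaussian density).
        resp.append(w * np.exp(-0.5 * (x - mu[0]) ** 2 / S[0, 0])
                    / np.sqrt(2 * np.pi * S[0, 0]))
    resp = np.array(resp) / np.sum(resp)
    return float(resp @ np.array(cond_means))

stiff_lo = gmr(0.0)   # near the first component, so close to 10
stiff_hi = gmr(1.0)   # near the second component, so close to 30
```

    The same conditioning applies per HSMM state for velocity and force, giving a single consistent generator for all control variables.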

    A brief review of neural networks based learning and control and their applications for robots

    Get PDF
    As an imitation of biological nervous systems, neural networks (NNs), which are characterized by a powerful learning ability, have been employed in a wide range of applications, such as control of complex nonlinear systems, optimization, system identification and pattern recognition. This article provides a brief review of state-of-the-art NNs for complex nonlinear systems. Recent progress of NNs in both theoretical development and practical applications is investigated and surveyed. Specifically, NN-based robot learning and control applications are further reviewed, including NN-based robot manipulator control, NN-based human-robot interaction, and NN-based behavior recognition and generation

    A multimodal human-robot sign language interaction framework applied in social robots

    Get PDF
    Deaf-mutes face many difficulties in daily interactions with hearing people through spoken language. Sign language is an important means of expression and communication for deaf-mutes. Therefore, breaking the communication barrier between the deaf-mute and hearing communities is significant for facilitating their integration into society. To help them integrate into social life better, we propose a multimodal Chinese sign language (CSL) gesture interaction framework based on social robots. The CSL gesture information, including both static and dynamic gestures, is captured by two different modal sensors: a wearable Myo armband and a Leap Motion sensor are used to collect human arm surface electromyography (sEMG) signals and hand 3D vectors, respectively. The two modalities of gesture data are preprocessed and fused to improve recognition accuracy and to reduce the processing time cost of the network before they are sent to the classifier. Since the inputs of the proposed framework are temporal gesture sequences, a long short-term memory (LSTM) recurrent neural network is used to classify them. Comparative experiments are performed on an NAO robot to test our method. Moreover, our method can effectively improve CSL gesture recognition accuracy and has potential applications in a variety of gesture interaction scenarios beyond social robots
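    The fusion step described above can be sketched as early, feature-level fusion: the two modalities are resampled to a common number of timesteps and concatenated per frame before classification. The channel counts and sequence lengths below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def resample(seq, n_steps):
    """Linearly interpolate each channel of a (T, C) sequence to n_steps frames."""
    t_old = np.linspace(0.0, 1.0, seq.shape[0])
    t_new = np.linspace(0.0, 1.0, n_steps)
    return np.stack([np.interp(t_new, t_old, seq[:, c])
                     for c in range(seq.shape[1])], axis=1)

def fuse(emg, leap, n_steps=50):
    """Early fusion: align both modalities in time, then concatenate channels.
    The fused (n_steps, C_emg + C_leap) sequence would feed the LSTM classifier."""
    return np.concatenate([resample(emg, n_steps), resample(leap, n_steps)], axis=1)

rng = np.random.default_rng(2)
emg = rng.random((120, 8))    # e.g., 8 Myo sEMG channels sampled at a higher rate
leap = rng.random((40, 15))   # e.g., 15 Leap Motion hand-vector features
fused = fuse(emg, leap)
```

    Aligning the modalities before classification both fixes the classifier's input width and avoids running a separate network per sensor.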

    Iterative learning control based on stretch and compression mapping for trajectory tracking in human-robot collaboration

    Get PDF
    This paper presents a novel iterative learning control (ILC) scheme based on stretch and compression mapping for a robotic manipulator to learn its human partner's desired trajectory, which is a typical task in the field of human-robot interaction. The proposed scheme is used to reduce the interaction force between the robot and the human partner over the repetitive learning process. Thus, the robot can track the human partner's repetitive trajectory with a small interaction force, requiring little control effort from the human. As the human is involved in the control loop, there are various uncertainties in the system, including a variable iteration period in the task under study. The stretch and compression mapping is applied to address this problem. In simulation, the proposed scheme is implemented in the human-robot interaction scenario. Results confirm the effectiveness of the proposed scheme and also illustrate better performance of the proposed ILC compared with other ILC methods under variable periods
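    The role of the stretch and compression mapping can be sketched as follows: before the standard ILC update u_{k+1} = u_k + L * e_k is applied, the previous iteration's input and error are mapped by interpolation onto the current iteration's (possibly different) length. The mapping and learning gain below are illustrative, not the paper's exact scheme:

```python
import numpy as np

def stretch_map(signal, n_new):
    """Stretch or compress a sampled signal onto a new iteration length."""
    t_old = np.linspace(0.0, 1.0, signal.shape[0])
    t_new = np.linspace(0.0, 1.0, n_new)
    return np.interp(t_new, t_old, signal)

def ilc_update(u_prev, e_prev, n_new, gain=0.5):
    """u_{k+1}(t) = u_k(t) + L * e_k(t), with both terms mapped from the
    previous iteration's time axis onto the current one."""
    return stretch_map(u_prev, n_new) + gain * stretch_map(e_prev, n_new)

# Toy identity plant: output = input; the desired trajectory is a sine and
# the iteration length varies, emulating a variable iteration period.
u = np.zeros(100)
for n_steps in (100, 80, 120, 100):
    ref = np.sin(np.linspace(0.0, np.pi, n_steps))
    u = ilc_update(u, ref - stretch_map(u, n_steps), n_steps)

# Tracking error after four variable-length iterations.
err = np.max(np.abs(np.sin(np.linspace(0.0, np.pi, 100)) - u))
```

    With a contraction gain 0 < L < 1 the residual error shrinks every iteration despite the changing period, which is the property the variable-period mapping is meant to preserve.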